237 research outputs found

    Lightweight thermal compensation technique for MEMS capacitive accelerometer oriented to quasi-static measurements

    The application of MEMS capacitive accelerometers is limited by their thermal dependence, and each accelerometer must be individually calibrated to improve its performance. In this work, a lightweight calibration method based on theoretical studies is proposed to obtain two characteristic parameters of the sensor’s operation: the temperature drift of bias and the temperature drift of scale factor. This method requires less data to obtain the characteristic parameters, allowing a faster calibration. Furthermore, using an equation with fewer parameters reduces the computational cost of compensation. Six LIS3DSH accelerometers were studied, and their characteristic parameters were obtained over a temperature range from 15 °C to 55 °C. The Temperature Drift of Bias (TDB) is the parameter with the greatest influence on thermal drift, reaching 1.3 mg/°C. The Temperature Drift of Scale Factor (TDSF) is always negative and ranges between 0 and −400 ppm/°C. With these parameters, the thermal drifts are compensated in tests with 20 °C of thermal variation. An average improvement of 47% was observed. In the axes where the thermal drift was greater than 1 mg/°C, the improvement was greater than 80%. Other sensor behaviors have also been analyzed, such as temporal drift (up to 1 mg/h for three hours) and self-heating (2–3 °C in the first hours with the corresponding drift). Thermal compensation has been found to reduce the effect of the latter in the first hours after power-up of the sensor by 43%.
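
    A minimal sketch of how a two-parameter compensation of this kind might be applied to one axis. It assumes a linear model in which the bias shifts by TDB per degree and the scale factor changes by TDSF per degree relative to a reference temperature. The model form, the function name `compensate`, and the 25 °C reference are illustrative assumptions, not the equation used in the paper.

```python
def compensate(a_meas_mg, temp_c, tdb_mg_per_c, tdsf_ppm_per_c, t_ref_c=25.0):
    """Remove linear thermal drift from one axis reading (in mg).

    Assumed model (hypothetical, for illustration only):
        a_meas = a_true * (1 + TDSF * dT) + TDB * dT
    where dT is the temperature offset from the reference temperature.
    """
    dt = temp_c - t_ref_c
    bias_drift = tdb_mg_per_c * dt                # mg added by bias drift
    scale = 1.0 + tdsf_ppm_per_c * 1e-6 * dt      # relative scale-factor change
    return (a_meas_mg - bias_drift) / scale


# A 1000 mg input read 20 °C above reference with TDB = 1.3 mg/°C and
# TDSF = -400 ppm/°C appears as 1000 * 0.992 + 26 = 1018 mg before compensation.
corrected = compensate(1018.0, 45.0, 1.3, -400.0)
```

    Only one subtraction and one division per sample are needed, which is in line with the low computational cost the abstract describes.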

    Factory Oriented Technique for Thermal Drift Compensation in MEMS Capacitive Accelerometers

    Capacitive MEMS accelerometers have a high thermal sensitivity that causes the output to drift when they are subjected to changes in temperature. To improve their performance in applications with thermal variations, it is necessary to compensate for these effects. These drifts can be compensated using a lightweight algorithm by knowing the characteristic thermal parameters of the accelerometer (Temperature Drift of Bias and Temperature Drift of Scale Factor). These parameters vary in each accelerometer and axis, making an individual calibration necessary. In this work, a simple and fast calibration method is proposed that allows the characteristic parameters of the three axes to be obtained simultaneously through a single test. This method is based on the study of two specific orientations, each at two temperatures. Through a suitable selection of the orientations and the temperature points, the data obtained can be extrapolated to the entire working range of the accelerometer. Only a mechanical anchor and a heat source are required to perform the calibration. This technique can be scaled to calibrate multiple accelerometers simultaneously. A lightweight algorithm is used to analyze the test data and obtain the compensation parameters. This algorithm stores only the most relevant data, reducing memory and computing power requirements. This allows it to be run in real time on a low-cost microcontroller during testing to obtain compensation parameters immediately. This method is aimed at mass factory calibration, where individual calibration with traditional methods may not be an adequate option. The proposed method has been compared with a traditional calibration using a six-sided orthogonal die and a thermal camera. The average difference between the compensations according to both techniques is 0.32 mg/°C, calculated on an acceleration of 1 g, with a maximum deviation of 0.6 mg/°C.
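
    The idea of recovering both parameters from two orientations at two temperatures can be sketched for a single axis as follows. This assumes the drift per degree is linear in the applied acceleration, d = TDB + TDSF · a, so that two orientations give two equations in two unknowns. The function name, argument names, and the linear model are assumptions for illustration; the abstract does not specify the orientations or the algorithm the paper uses.

```python
def estimate_thermal_params(a1_mg, a2_mg,
                            m1_cold, m1_hot,
                            m2_cold, m2_hot,
                            t_cold_c, t_hot_c):
    """Estimate (TDB in mg/°C, TDSF in ppm/°C) for one axis.

    a1_mg, a2_mg : reference accelerations seen by the axis in the two
                   orientations (must differ).
    m*_cold/hot  : readings in mg at the cold and hot temperature points.
    Assumed model (hypothetical): drift per °C = TDB + TDSF * a.
    """
    dt = t_hot_c - t_cold_c
    d1 = (m1_hot - m1_cold) / dt          # drift slope in orientation 1
    d2 = (m2_hot - m2_cold) / dt          # drift slope in orientation 2
    tdsf = (d1 - d2) / (a1_mg - a2_mg)    # fractional change per °C
    tdb = d1 - tdsf * a1_mg               # mg per °C
    return tdb, tdsf * 1e6


# Synthetic readings generated with TDB = 1.0 mg/°C and TDSF = -200 ppm/°C.
tdb, tdsf_ppm = estimate_thermal_params(1000.0, -1000.0,
                                        1000.0, 1016.0,
                                        -1000.0, -976.0,
                                        20.0, 40.0)
```

    Because only four readings per axis are kept, a routine like this fits easily on a low-cost microcontroller.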

    Self-Calibration Technique with Lightweight Algorithm for Thermal Drift Compensation in MEMS Accelerometers

    Capacitive MEMS accelerometers have a high thermal sensitivity that causes the output to drift when they are subjected to changes in temperature. To improve their performance in applications with thermal variations, it is necessary to compensate for these effects. These drifts can be compensated using a lightweight algorithm by knowing the characteristic thermal parameters of the accelerometer (Temperature Drift of Bias and Temperature Drift of Scale Factor). These parameters vary in each accelerometer and axis, making an individual calibration necessary. In this work, a simple and fast calibration method is proposed that allows the characteristic parameters of the three axes to be obtained simultaneously through a single test. This method is based on the study of two specific orientations, each at two temperatures. Through a suitable selection of the orientations and the temperature points, the data obtained can be extrapolated to the entire working range of the accelerometer. Only a mechanical anchor and a heat source are required to perform the calibration. This technique can be scaled to calibrate multiple accelerometers simultaneously. A lightweight algorithm is used to analyze the test data and obtain the compensation parameters. This algorithm stores only the most relevant data, reducing memory and computing power requirements. This allows it to be run in real time on a low-cost microcontroller during testing to obtain compensation parameters immediately. This method is aimed at mass factory calibration, where individual calibration with traditional methods may not be an adequate option. The proposed method has been compared with a traditional calibration using six tests in orthogonal directions and a thermal chamber, with a relative error difference of 0.3%.

    Direction of Arrival Estimation with Microphone Arrays Using SRP-PHAT and Neural Networks

    The Steered Response Power with phase transform (SRP-PHAT) is one of the most widely used techniques for Direction of Arrival (DOA) estimation with microphone arrays, but its computational complexity grows as the search space increases. To address this issue, we propose the use of Neural Networks (NN) to obtain the DOA from low-resolution SRP-PHAT power maps.

    Emotional classification of music using neural networks with the MediaEval dataset

    The proven ability of music to transmit emotions has driven increasing interest in the development of new algorithms for music emotion recognition (MER). In this work, we present an automatic system for the emotional classification of music based on a neural network. This work builds on a previous implementation of a dimensional emotional prediction system in which a multilayer perceptron (MLP) was trained with the freely available MediaEval database. Although these previous results are good in terms of the metrics of the prediction values, they are not good enough to obtain a classification by quadrant based on the valence and arousal values predicted by the neural network, mainly due to the imbalance between classes in the dataset. To achieve better classification values, a pre-processing phase was implemented to stratify and balance the dataset. Three different classifiers have been compared: linear support vector machine (SVM), random forest, and MLP. The best results are obtained with the MLP. An averaged F-measure of 50% is obtained in a four-quadrant classification schema. Two binary classification approaches are also presented: a one-vs.-rest (OvR) approach in four quadrants and a binary classifier in valence and arousal. The OvR approach has an average F-measure of 69%, and the second one obtained F-measures of 73% and 69% in valence and arousal, respectively. Finally, a dynamic classification analysis with different time windows was performed using the temporal annotation data of the MediaEval database. The results obtained show that the classification F-measures in four quadrants are practically constant, regardless of the duration of the time window. This work also reflects some limitations related to the characteristics of the dataset, including size, class balance, quality of the annotations, and the sound features available.
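
    The quadrant classification mentioned above maps a predicted (valence, arousal) pair to one of four classes. A minimal sketch, assuming both values are centered at zero and using a counter-clockwise quadrant numbering; both the centering and the numbering are assumptions for illustration, since the abstract does not state the convention used.

```python
def quadrant(valence, arousal):
    """Map a (valence, arousal) pair to a quadrant label 1..4.

    Assumed convention (hypothetical):
      1: positive valence, positive arousal
      2: negative valence, positive arousal
      3: negative valence, negative arousal
      4: positive valence, negative arousal
    Values exactly at zero are treated as positive.
    """
    if arousal >= 0:
        return 1 if valence >= 0 else 2
    return 3 if valence < 0 else 4


labels = [quadrant(v, a) for v, a in [(0.4, 0.7), (-0.2, 0.1), (-0.5, -0.3), (0.6, -0.8)]]
```

    Predictions that fall close to either axis can flip quadrant with a small regression error, which is one reason good regression metrics do not automatically give good quadrant classification.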

    ENSA dataset: a dataset of songs by non-superstar artists tested with an emotional analysis based on time-series

    This paper presents a novel dataset of songs by non-superstar artists in which a set of musical data is collected, identifying for each song its musical structure and the emotional perception of the artist through a categorical emotional labeling process. The generation of this preliminary dataset is motivated by biases that have been detected in the most widely used datasets in the field of emotion-based music recommendation. This new dataset contains 234 min of audio and 60 complete and labeled songs. In addition, an emotional analysis is carried out based on the representation of dynamic emotional perception through a time-series approach, in which the similarity values generated by the dynamic time warping (DTW) algorithm are analyzed and then used to implement a clustering process with the K-means algorithm. Clustering is also implemented with Uniform Manifold Approximation and Projection (UMAP), a manifold learning and dimension reduction algorithm. The HDBSCAN algorithm is applied to determine the optimal number of clusters. The results obtained from the different clustering strategies are compared and, in a preliminary analysis, a significant consistency is found between them. Based on the findings and experimental results, a discussion is presented highlighting the importance of working with complete songs, preferably with a well-defined musical structure, considering the emotional variation that characterizes a song during the listening experience, in which the intensity of the emotion usually changes between verse, bridge, and chorus.
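
    For reference, the DTW similarity between two emotion time series can be computed with the classic dynamic-programming recurrence. The sketch below is a plain textbook version using absolute difference as the local cost; the function name and the choice of local cost are assumptions, and the paper may use a different variant or a library implementation.

```python
def dtw_distance(x, y):
    """Dynamic time warping distance between two 1-D sequences.

    cost[i][j] holds the minimal accumulated cost of aligning the first
    i samples of x with the first j samples of y.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # x advances, y stays
                                     cost[i][j - 1],      # y advances, x stays
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The same contour at a different tempo aligns with zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

    A matrix of pairwise DTW distances between songs can then serve as the input to a clustering step.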

    Impact of Thermal Variations and Soldering Process on Performance and Behavior of MEMS Capacitive Accelerometers

    This work presents an analysis of the performance and multiple parameters of microelectromechanical system (MEMS) capacitive accelerometers in applications with large thermal variations, and of the effects of the soldering process on them. The proposed test consists of a thermal characterization phase performed between two mechanical calibrations. The test is performed on multiple units before and after the soldering process. Mechanical, thermal, and performance parameters are analyzed and compared among all tests. The ranges and relative variations of these characteristics, both during the soldering process and the tests, have been identified and characterized individually. Mechanical bias shows greater variability than other parameters in both the soldering process and thermal tests. In contrast, the thermal characteristic parameters show great stability in all cases. The thermal drifts, which are the main source of error in environments with large thermal variations, are successfully compensated for using a model with only two characteristic parameters. According to the observed behaviors, negative thermal variations (toward cooler temperatures) might be more suitable for thermal calibration because other effects, such as creep, take place primarily at hotter temperatures. The creep effect at constant temperature is analyzed according to the Kelvin–Voigt model with promising results, and a possible link between thermal drift and creep effects is presented. Performance results are calculated in multiple compensation scenarios. Using the proposed compensation techniques, the average maximum error is reduced from over 70 to 7 mg and the uncertainty is also reduced to a third of the initial value.
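
    For context, the standard creep response of a Kelvin–Voigt element (a spring of modulus E in parallel with a damper of viscosity η) under a constant stress σ₀ applied at t = 0 is shown below. This is the textbook form of the model; how the paper maps it onto accelerometer output is not stated in the abstract.

```latex
\varepsilon(t) = \frac{\sigma_0}{E}\left(1 - e^{-t/\tau}\right),
\qquad \tau = \frac{\eta}{E}
```

    The strain approaches σ₀/E exponentially with time constant τ, so a creep-driven drift of this kind would level off over time.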

    A Geometric Deep Learning Approach to Sound Source Localization and Tracking

    The localization and tracking of sound sources using microphone arrays is a problem that, even though it has attracted attention from the signal processing research community for decades, remains open. In recent years, deep learning models have surpassed the state of the art that had been established by classic signal processing techniques, but these models still struggle with rooms with strong reverberation or with tracking multiple sources that dynamically appear and disappear, especially when we cannot apply any criteria to classify or order them. In this thesis, we follow the ideas of the Geometric Deep Learning framework to propose new models and techniques that advance the state of the art in the aforementioned scenarios.
    As the input of our models, we use acoustic power maps computed using the SRP-PHAT algorithm, a classic signal processing technique that allows us to estimate the acoustic energy received from any direction of the space and, therefore, compute arbitrary-shaped power maps. In addition, we also propose a new technique to analytically cancel a source from the generalized cross-correlations used to compute the SRP-PHAT maps. Based on previous narrowband cancellation techniques, we prove that we can project the cross-correlation functions of the signals captured by a microphone array into a space orthogonal to a given direction by just computing a linear combination of time-shifted versions of the original cross-correlations. The proposed cancellation technique can be used to design iterative multi-source localization systems where, after having found the strongest source in the generalized cross-correlation functions or in the SRP-PHAT maps, we can cancel it and find new sources that were previously masked by the first source.
    Before being able to train deep learning models we need data, which, in the case of following a supervised learning approach, means a dataset of multichannel recordings with the position of the sources accurately labeled. Although some datasets like this exist, they are not large enough to train a neural network and the acoustic environments they include are not diverse enough. To overcome this lack of real data, we present a technique to simulate acoustic scenes with one or several moving sound sources and, to be able to perform these simulations as they are needed during the training, we present what is, to the best of our knowledge, the first free and open source room acoustics simulation library with GPU acceleration. As we prove in this thesis, the presented library is more than two orders of magnitude faster than other state-of-the-art CPU libraries.
    The main idea of the Geometric Deep Learning philosophy is that the models should fit the symmetries (i.e. the invariances and equivariances) of the data and the problem we want to solve. For single-source direction of arrival estimation, the use of SRP-PHAT maps as inputs of our models makes the rotational equivariance of the problem clear and, after a first approach using 3D convolutional neural networks, we present a model using icosahedral convolutions that approximate the equivariance to the continuous group of spherical rotations by the equivariance to the discrete group of the 60 icosahedral symmetries. We prove that the SRP-PHAT maps are a much more robust input feature than the spectrograms typically used in many state-of-the-art models and that the use of the icosahedral convolutions, combined with a new soft-argmax function that obtains a regression output from the output of the convolutional neural network by interpreting it as a probability distribution and computing its expected value, allows us to dramatically reduce the number of trainable parameters of the models without losing accuracy in their estimations.
    When we want to track multiple moving sources and we cannot use any criteria to order or classify them, the problem becomes invariant to the permutations of the estimates, so we cannot directly compare them with the ground truth labels since we cannot expect them to be in the same order. Models of this kind have typically been trained using permutation invariant training strategies, but these strategies usually do not penalize identity switches, and the models trained with them do not keep the identity of every source consistent during the tracking. To solve this issue, we propose a new training strategy, which we call sliding permutation invariant training (sPIT), that is able to optimize all the features that we could expect from a multi-source tracking system: the precision of the direction of arrival estimates, the accuracy of the source detections, and the consistency of the assigned identities.
    Finally, we propose a new kind of recursive neural network that, instead of using vectors as its input and its state, uses sets of vectors and is invariant to the permutations of the elements of the input set and equivariant to the permutations of the elements of the state set. We show how this is the behavior that we should expect from a tracking model which takes as inputs the estimates of a multi-source localization model and compare these permutation-invariant recursive neural networks with conventional gated recurrent units for sound source tracking applications.
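
    The soft-argmax idea described in the abstract (interpret the network output as a probability distribution and take its expected value) can be sketched in one dimension as follows. The function name, the use of a softmax for normalization, and the 1-D coordinate grid are assumptions for illustration; the thesis applies the idea to the output of a convolutional network over directions, and its exact formulation is not given here.

```python
import math


def soft_argmax(scores, coords):
    """Differentiable alternative to argmax over a discrete grid.

    scores : unnormalized network outputs, one per grid point.
    coords : coordinate associated with each grid point.
    Returns the expected coordinate under softmax(scores).
    """
    peak = max(scores)                                # subtract max for stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, coords)) / total


# Two equally likely neighbors give an estimate halfway between them,
# which a hard argmax over the grid could not return.
estimate = soft_argmax([0.0, 0.0], [10.0, 20.0])
```

    Because the output is a weighted average of grid coordinates, the estimate is not limited to the grid resolution. Note that a plain average like this is only meaningful for coordinates that do not wrap around.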

    Development of a social network and tools for singers through an Android application

    This work consists of the development of an Android application for singers, called Singvibes, that lets them analyze their intonation in real time. It also lets them find and interact with other members, offering users both technical tools, such as a pitch and tuning detector, and the opportunity to improve through contact with other singers. The work comprised two projects. On the one hand, a server was implemented and deployed on an EC2 instance in the Amazon cloud, covering everything from configuration to development, including a custom implementation of the MVC paradigm through a personal framework called LeafStormMVC. On the other hand, the Android application was developed, with a pitch detector implemented using the Fast-Lifting-Wavelet Transform algorithm (documented in the report). At the time of submission, the project is in beta (basic analysis and social network features implemented). The attached document presents all the important stages of the project, from the definition of objectives and requirements to the implementation and future planning, including the publication of the application on Google Play, followed by continuous improvement in an agile style.